Abstract



Physical mechanisms underlying the empirical correlation between relative contact or- 
der (CO) and folding rate among naturally-occurring small single-domain proteins are 
investigated by evaluating postulated interaction schemes for a set of three-dimensional 
27mer lattice protein models with 97 different CO values. Many-body interactions are 
constructed such that contact energies become more favorable when short chain segments 
sequentially adjacent to the contacting residues adopt native-like conformations. At a 
given interaction strength, this scheme leads to folding rates that are logarithmically well 
correlated with CO (correlation coefficient r = 0.914) and span more than 2.5 orders of 
magnitude, whereas folding rates of the corresponding Go models with additive contact 
energies have much less logarithmic correlation with CO and span only approximately one 
order of magnitude. The present protein chain models also exhibit calorimetric cooperativity
and linear chevron plots similar to those observed experimentally for proteins with
apparent simple two-state folding/unfolding kinetics. Thus, our findings suggest that 
CO-dependent folding rates of real proteins may arise partly from a significant positive 
coupling between nonlocal contact favorabilities and local conformational preferences. 
INTRODUCTION



Generic protein properties as energetic constraints 

The folding of many small single-domain proteins is well approximated by simple 
two-state thermodynamics and kinetics. In the past several years, we have shown that
fundamental insights into protein energetics can be gained by using these general, apparently
mundane properties as experimental constraints on protein chain models. This
ently mundane properties as experimental constraints on protein chain models. This 
approach is based on the recognition that model interaction schemes capable of produc- 
ing these commonly observed experimental properties are, somewhat surprisingly, not 
entirely straightforward to come up with. To date, much progress has been made by
coarse-grained modeling of protein folding. Nonetheless, the interactions postulated
by many existing models are insufficient for calorimetric two-state cooperativity.
Furthermore, even common Go models are not cooperative enough for simple two-state
kinetics, their explicit native biases notwithstanding. Specifically, we recently found that
several lattice and continuum (off-lattice) Go-like formulations with essentially additive
interaction schemes all led to chevron rollovers, a hallmark of folding kinetics
that are often operationally referred to as non-two-state. Apparently, many-body interactions
are needed to produce chevron plots with linear folding and unfolding arms
consistent with a two-state description of equilibrium thermodynamics.

Small single-domain proteins are characterized as well by a significant correlation 
between relative contact order (CO) and folding rate. Therefore, it is only logical to
require a model protein interaction scheme to produce a similar correlation. Ising-like
and other constructs without explicit chain representations have had successes
in this regard. However, as for thermodynamic and kinetic cooperativities, achieving
the CO dependence requirement in models with explicit chain representations appears
to be a nontrivial task. Notably, an early lattice model study using a 20-letter alphabet
suggested that proteins with higher CO should fold faster, thus predicting a trend
opposite to that for real single-domain proteins. A more recent 20-letter lattice
model investigation, on the other hand, found modest correlations between increasing
CO and longer logarithmic folding time (correlation coefficient r ≈ 0.70-0.79 for chain
lengths > 54). An earlier continuum Go model study of 18 proteins also found a modest
correlation between increasing CO and slower logarithmic folding rates (r = 0.69).
But the corresponding dispersion in simulated folding rates covers only ~1.5 orders of
magnitude, which is much narrower than the ~5 orders of magnitude covered by the
real folding rates of the proteins in the given dataset. When a different potential func- 
tion was used in a more recent continuum Go model analysis, however, no correlation 
between CO and simulated folding rates was discerned. 

Recently, based on lattice 27mer simulations, Jewett et al. have proposed that enhanced
thermodynamic cooperativity and many-body interactions, which are basic
properties of individual two-state proteins to begin with, may also be a key to understanding
the correlation between CO and folding rate across different proteins. This is
an attractive and insightful idea. However, the particular way in which thermodynamic 
cooperativity was enhanced by these authors led only to modest increases in folding 
rate dispersion relative to that for the corresponding lattice Go models with pairwise 
additive contact energies. Both the dispersion in folding rates and the correlation of
logarithmic folding rate with CO (r = 0.75) for the most cooperative interaction scheme
they reported were similar to those obtained from an earlier continuum Go model study,
as well as those from a recent simulation of 20-letter lattice models with only pairwise
additive contact energies (see above). In our view, these results suggest that while CO-dependent
folding may well derive from certain intraprotein interactions that are also
responsible for high thermodynamic cooperativity, CO-dependent folding does not arise 
from thermodynamic cooperativity per se. In other words, how cooperativity is achieved 
can be critically important. Many a priori many-body mechanisms are consistent with 
high thermodynamic cooperativity. An example is the two rather different interaction 
schemes we considered in ref. 10 — one involves local-nonlocal coupling while the other 
assigns an extra favorable energy to the ground-state structure as a whole. But per- 
haps not all such mechanisms can mimic experimentally observed CO dependencies to 
the same degree. Therefore, to shed light on the physical mechanisms of CO-dependent 
folding, we endeavor to construct an interaction scheme that would provide larger dis- 
persions in folding rates and better correlations with CO. 



MODELS AND METHODS 

The present study focuses on the idea of a cooperative interplay between local conformational
preferences and the contact-like interactions that drive the packing of the
protein core. We have shown that chain models embodying this idea can lead to
calorimetric cooperativity and simple two-state kinetics, although our exploration thus
far has been limited to model proteins that are mostly helical. Here we consider a
general formulation of this idea, the basic ingredients of which are described by Fig. 1A.
This hypothesis may be viewed as a synthesis of the local-dominant and the nonlocal-dominant
perspectives. We were motivated by the recognition that both local and
nonlocal intraprotein interactions are important determinants of protein structure
and stability. Yet local conformational preferences alone are often insufficient for stable
secondary structures under physiological conditions. Secondary structure formation is
known to be context dependent; secondary structures are stable when packed in the core
of a protein but are usually not stable in isolation (ref. 31 and references therein).
Furthermore, conformational space grows exponentially with chain length, even when
preferences arising from local excluded volume effects are taken into account. It follows
that a large part of the stability and uniqueness of protein native structures cannot be
explained by local interactions alone. On the other hand, our recent Go-model studies
have shown that nonlocal contact-like interactions by themselves are not cooperative
enough for simple two-state kinetics if they are not coupled to local conformational
propensities.

A simple model of local-nonlocal coupling 

Here we explore the hypothesis in Fig. 1A by incorporating its form of local-nonlocal
coupling into a new interaction scheme in Fig. 1B for explicit-chain models configured on
three-dimensional simple cubic lattices. This allows the idea to be tested quantitatively.
Fig. 1B may be viewed as a generalization of similar constructs we have used previously
in the context of helical proteins. As a first step in our inquiry, we make the simplifying
assumption that the interactions are native-centric, in that only native
interactions are favored, while nonnative interactions are neutral (have zero energy). The
local-nonlocal coupling in Fig. 1B involves nonadditive many-body interactions. A chain
segment that is locally nativelike (with native bond and torsion angles) but makes no
native contact is not stabilized (contributing zero energy). On the other hand, nonlocal
contact interactions between monomers far apart along the chain sequence are more
favorable when the chain segments around the contacting residues are in their native
conformations than when they are not. As such, the present model differs from models that
additively combine contact energies and local favorabilities. The importance of nonadditive
many-body effects in protein folding has been recognized, but they
have not been used extensively to model calorimetric two-state cooperativity and linear
chevron plots. Our aim here is to utilize extremely coarse-grained representations as
a computationally efficient means to explore the general principles linking CO-dependent
folding and proteinlike cooperativities. Many structural and energetic details of real
proteins are beyond the scope of this work. In particular, the present work does not deal
with the microscopic physical origins of local-nonlocal coupling. Instead we just presume
that its presence in naturally occurring proteins could arise from evolutionary design.
For these reasons, the simple interaction scheme in Fig. 1B should be viewed only as a
tentative model in this regard.
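
The coupling described above can be illustrated with a minimal Python sketch. It is not the scheme of Fig. 1B itself, whose details are not reproduced in the text. We assume here that a formed native contact contributes ε when the chain segments around both contacting residues are nativelike and aε otherwise, that a "segment" is a window of `half_width` residues on either side of the contacting residue, and that nativeness can be tested by comparing intra-window distances with the native ones (a test that does not distinguish mirror images). The names `energy`, `segment_is_native`, `half_width`, and the 6-mer example are hypothetical.

```python
from itertools import combinations

def sq_dist(p, q):
    """Squared Euclidean distance between two lattice sites."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def native_contacts(native):
    """Nonbonded nearest-neighbour pairs (i, j) of the native structure."""
    n = len(native)
    return [(i, j) for i in range(n) for j in range(i + 3, n)
            if sq_dist(native[i], native[j]) == 1]

def segment_is_native(conf, native, centre, half_width):
    """Assumed test of local nativeness: all intra-window distances match."""
    lo = max(0, centre - half_width)
    hi = min(len(conf) - 1, centre + half_width)
    return all(sq_dist(conf[p], conf[q]) == sq_dist(native[p], native[q])
               for p, q in combinations(range(lo, hi + 1), 2))

def energy(conf, native, eps=-1.0, a=0.1, half_width=2):
    """Native-centric energy; a = 1 recovers the additive Go limit."""
    total = 0.0
    for i, j in native_contacts(native):
        if sq_dist(conf[i], conf[j]) != 1:
            continue  # contact not formed: contributes nothing
        strong = (segment_is_native(conf, native, i, half_width)
                  and segment_is_native(conf, native, j, half_width))
        total += eps if strong else a * eps
    return total

# Hypothetical 6-mer hairpin used only to exercise the function.
native = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0), (0, 1, 0)]
# Same turn, but both chain ends are bent away from their native positions.
frayed = [(1, 0, 1), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0), (1, 1, -1)]

print(energy(native, native))         # -2.0: both native contacts are strong
print(energy(frayed, native, a=0.1))  # -0.1: one contact formed, attenuated
print(energy(frayed, native, a=1.0))  # -1.0: additive limit, no attenuation
```

In this sketch the ground-state energy is independent of a, and every other conformation has an energy no lower than ε times its number of formed native contacts, which is the qualitative feature the coupling is meant to capture.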

In order to examine the folding rates of a set of model proteins whose native structures
cover a diverse range of CO values, we now consider chains of length n = 27 configured
on simple cubic lattices. For these 27mers, there are 103,346 distinct maximally compact
conformations (not related by rotations or inversions) confined to a 3 x 3 x 3 cube.
The distribution of CO among these maximally compact conformations covers 97 different
values from CO = 208/756 = 0.275 to 402/756 = 0.532 (inset of Fig. 2A, where CO
is computed using equation 1 of ref. 16). For each CO value, we randomly choose a maximally
compact 27mer conformation as the native structure of a model protein (Table I).*
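
As an illustration of how a CO value follows from a Table I entry, the sketch below converts a string of lattice steps into coordinates and sums the sequence separations of all contacts. Two points are assumptions rather than statements from the text: the mapping of the letters u, d, f, b, l, r onto axes (any consistent orthogonal choice gives the same CO), and the reading of the denominator 756 as chain length times number of contacts (27 x 28), which is consistent with the values quoted above. The function names are hypothetical.

```python
from itertools import combinations

# Assumed letter-to-axis mapping; the text does not define it.
STEPS = {"u": (0, 0, 1), "d": (0, 0, -1),
         "f": (0, 1, 0), "b": (0, -1, 0),
         "r": (1, 0, 0), "l": (-1, 0, 0)}

def walk_to_coords(walk):
    """Turn a string of unit steps into a list of lattice coordinates."""
    coords = [(0, 0, 0)]
    for step in walk:
        dx, dy, dz = STEPS[step]
        x, y, z = coords[-1]
        coords.append((x + dx, y + dy, z + dz))
    if len(set(coords)) != len(coords):
        raise ValueError("walk is not self-avoiding")
    return coords

def contacts(coords):
    """Pairs (i, j), j > i + 1, that are nearest neighbours on the lattice."""
    pairs = []
    for i, j in combinations(range(len(coords)), 2):
        if j - i > 1:
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                pairs.append((i, j))
    return pairs

def relative_contact_order(walk):
    """Return (sum of |i - j| over contacts, chain length x contact count)."""
    coords = walk_to_coords(walk)
    pairs = contacts(coords)
    return sum(j - i for i, j in pairs), len(coords) * len(pairs)

# Table I entry listed under 216.
numerator, denominator = relative_contact_order("ufdfuubbrddfuufddruubbddfu")
print(numerator, denominator, round(numerator / denominator, 3))  # 216 756 0.286
```

A maximally compact 27mer occupies all 27 sites of the cube, which has 54 nearest-neighbour site pairs; 26 of them are chain bonds, leaving 28 contacts.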

Folding and unfolding kinetics are modeled by standard Monte Carlo simulations
using the Metropolis criterion and the elementary chain moves of end flips, corner flips,
crankshafts, and rigid rotations. The relative frequencies of attempting these moves are
4.7%, 58.3%, 27%, and 10%, respectively (cf. ref. 6).† Time is measured by the number
of attempted Monte Carlo moves for a given process. The set of elementary chain moves
is chosen to mimic physically plausible processes. Lattice model kinetics are dependent
on the choice of move set. Nonetheless, we expect the general trend predicted by the
model to be less sensitive to move set when kinetics are not dominated by trapping events,
* Since the present choices of structures are independent of those by Jewett et al., the structures
listed in Table I do not necessarily coincide with those used in their study.

† The following typographical error in ref. 6 should be corrected. The relative attempt frequencies of
corner flips and crankshafts used in this prior study of ours were, respectively, 60.6% and 27%, not the
27% and 60.6% stated on p. 901 of ref. 6.
as is the case here and has been verified by Jewett et al. Progress towards the native
state is tracked by the fractional number of native contacts Q (refs. 3-6). To ascertain
the implications of the local-nonlocal coupling we proposed, results from a highly cooperative
interaction scheme with a = 0.1 are compared with those from the additive
scheme (a = 1) of common Go models (cf. Fig. 1B). Folding trajectories are initiated
at a randomly generated conformation; folding first passage time is defined by the formation
of the Q = 1 ground-state conformation. Unfolding trajectories are initiated at
the ground-state conformation; unfolding first passage time is the time it takes for the
chain to be left with three or fewer native contacts (Q ≤ 3/28); Q = 3/28 is chosen to
define unfolding because it corresponds approximately to the free energy minimum for
the denatured state.
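
The bookkeeping of such a simulation can be sketched as follows. This is only an outline under stated assumptions: the chain moves themselves are not implemented, energies are taken in units of k_BT, and the function names (`choose_move`, `metropolis_accept`, `is_unfolded`) are hypothetical. The attempt frequencies and the unfolding threshold are those quoted in the text.

```python
import math
import random

# Relative attempt frequencies (percent) quoted in the text.
MOVE_WEIGHTS = {"end_flip": 4.7, "corner_flip": 58.3,
                "crankshaft": 27.0, "rigid_rotation": 10.0}

def choose_move(rng):
    """Pick the type of the next attempted move according to its frequency."""
    names = list(MOVE_WEIGHTS)
    return rng.choices(names, weights=[MOVE_WEIGHTS[m] for m in names])[0]

def metropolis_accept(delta_e_over_kt, rng):
    """Metropolis criterion: always accept downhill, else with exp(-dE/kT)."""
    if delta_e_over_kt <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e_over_kt)

def is_unfolded(n_native_contacts, threshold=3):
    """Unfolding first passage: three or fewer native contacts remain."""
    return n_native_contacts <= threshold

rng = random.Random(1)
attempts = [choose_move(rng) for _ in range(10)]
print(attempts)
print(metropolis_accept(-0.5, rng), is_unfolded(3), is_unfolded(4))
```

Because time is counted in attempted moves, the clock in a full simulation would advance by one on every call to `choose_move`, whether or not the move is accepted.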



RESULTS 

Sensitivity of folding rate on CO enhanced by local-nonlocal coupling 

Fig. 2 provides the correlation between CO and folding rate among our 27mer models.
It shows clearly that the local-nonlocal coupling mechanism postulated in Fig. 1 can
lead to a significant enhancement of correlation as well as much increased sensitivity of
folding rate to CO. Whereas the dispersion in folding rates among the common additive
Go models in Fig. 2A covers only approximately one order of magnitude (a factor of
ten) and the logarithmic folding rates exhibit only a relatively weak correlation with
CO (correlation coefficient r = 0.63), the corresponding dispersion among the a = 0.1
cooperative models in Fig. 2B covers approximately 2.5 to 3 orders of magnitude, with
a strong correlation between CO and logarithmic folding rate (r = 0.914) comparable to
that observed among a selection of real, small single-domain proteins. Similar to the
corresponding experimental situations, the comparisons in Fig. 2 were performed under
conditions for which folding relaxation is essentially single-exponential, as is evident
from the good agreement in Fig. 2 between median first passage time divided by ln 2 and
the corresponding mean first passage time. To better delineate the effects of having
weakened contact interactions when the chain segments locally adjacent to the contacting
residues are nonnative, several a values other than the a = 0.1 used for the main plot
are compared in the inset of Fig. 2B. It shows CO-dependent folding at different levels of
local-nonlocal coupling (different a values) for several 27mers with representative COs.
The a = 0 case here corresponds to complete interdependence between nonlocal contact
and local structure. This inset indicates that sensitivity of folding rate to CO increases
(the fitted line has a more negative slope) with decreasing a, and that the behavior of
the a = 0.1 models is very similar to that of the a = 0 models. These results further
affirm that local-nonlocal coupling is a key ingredient for the good correlation between
CO and folding rate in these models. Nevertheless, as for real proteins, despite the
good correlation, CO by itself cannot predict folding rates of the present models with
high accuracy. Folding rates here can vary significantly for different structures with the
same CO as well. For example, for the particular 27mer with CO = 346/756 = 0.458 in
Fig. 2B, the datapoint log10(folding rate) = −5.75 may be viewed as an "outlier" vis-à-vis
the fitted line. However, for two other 27mers that have the same CO but do not belong
to the randomly chosen set in Table I (and therefore are not plotted and not used in the
correlation analysis of Fig. 2B), we found log10(folding rate) = −7.26 and −7.60, which
happen to be much closer to the fitted line in Fig. 2B. The reasons behind variations in
folding rates among structures with the same CO remain to be elucidated.
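
The single-exponential criterion used above rests on the fact that an exponential distribution of first passage times has a median equal to ln 2 times its mean. The sketch below applies this check to synthetic data; the sample is drawn from an exponential distribution with an arbitrary mean of 10^6 attempted moves and does not represent simulation results from this study. The function name `exponential_check` is hypothetical.

```python
import math
import random
import statistics

def exponential_check(first_passage_times):
    """Return (mean FPT, median FPT / ln 2); equal for exponential decay."""
    mean_fpt = sum(first_passage_times) / len(first_passage_times)
    median_over_ln2 = statistics.median(first_passage_times) / math.log(2)
    return mean_fpt, median_over_ln2

# Synthetic first passage times, for illustration only.
rng = random.Random(0)
synthetic = [rng.expovariate(1.0 / 1.0e6) for _ in range(50000)]

mean_fpt, median_over_ln2 = exponential_check(synthetic)
print(mean_fpt / median_over_ln2)  # close to 1 for single-exponential kinetics
```

A ratio well above 1 would indicate a long tail of slow trajectories, as expected when kinetic trapping contributes to the relaxation.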

A consistent model of thermodynamic and kinetic cooperativity 

Fig. 3 provides further analyses of the folding/unfolding kinetics of one example 27mer
structure we choose to study in more detail. Consistent with our previous results,
it shows that the model chevron plot predicted by the common additive Go potential
(upper plot) deviates significantly from simple two-state kinetics in that it exhibits
a severe rollover under only moderately native conditions. More specifically, for this
case rollover becomes significant at ε/k_BT values that are only slightly more negative
(more favorable to folding) than that of the transition midpoint (ε/k_BT ≈ −1.43). In
contrast, the chevron plot predicted by the model with a substantial local-nonlocal coupling
(lower plot) is qualitatively similar to that of real, small single-domain proteins
that fold and unfold with simple two-state kinetics. In particular, it has essentially
linear folding and unfolding arms over an extended range of ε/k_BT values. We have
also obtained for this model the equilibrium free energy of unfolding ΔG_u as a function
of ε/k_BT, where ΔG_u here is taken to be that between the unique Q = 1 conformation
and those with Q ≤ 3/28. (The same definition is used for unfolding kinetics as
stated above.) Because ΔG_u is essentially linear in ε/k_BT, the linearity of the chevron
arms over an extended ε/k_BT range implies an essentially linear relationship between
logarithmic folding/unfolding rates and ΔG_u within the corresponding regime (i.e., the
model parameter ε may be eliminated in favor of the lower horizontal scale in Fig. 3).
Furthermore, comparing the mean first passage times in Fig. 3 versus the corresponding median first
passage times divided by ln 2 shows that folding or unfolding relaxation for this model is
essentially single exponential for ΔG_u < 10k_BT. Essentially single-exponential folding
under moderately folding conditions is further demonstrated by an approximately
linear logarithmic distribution of first passage time shown in the inset. Similar
to the cooperative models we recently investigated, for the model with local-nonlocal
coupling in Fig. 3, the thermodynamic ΔG_u values match well with the kinetically
obtained quantity −k_BT ln[(folding rate)/(unfolding rate)] for ΔG_u ranging from 10k_BT
to −6k_BT (lower V-shape). In other words, the folding/unfolding kinetics of this model
is simple two-state within a ΔG_u range quite similar to that experimentally accessible
to small single-domain proteins. Finally, the cooperative model in Fig. 3 is also
calorimetrically two-state. Assuming that the interactions are temperature independent,
the model's van't Hoff to calorimetric enthalpy ratio ΔH_vH/ΔH_cal (κ_2 without baseline
subtraction) is determined to be 0.992 (detailed calculation not shown), satisfying the
requirement of ΔH_vH/ΔH_cal ≈ 1 for two-state thermodynamics. Taken together, the
above considerations imply that the local-nonlocal coupling mechanism for enhanced
CO-dependent folding in Fig. 2B also provides, as it should, a consistent account of
thermodynamic and kinetic cooperativities in simple two-state proteins (Fig. 3).
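
Since the detailed calculation of the enthalpy ratio is not shown, the following sketch indicates how such a ratio can be obtained from a density of states when the interactions are temperature independent. It assumes the commonly used definition κ_2 = 2 T_max [k_B C_P(T_max)]^(1/2) / ΔH_cal, where T_max is the temperature of the heat capacity peak and ΔH_cal is the integral of the heat capacity over the transition, with no baseline subtraction; this definition is our assumption for the sketch and is not spelled out in the text. Units are chosen so that k_B = 1. The two-level density of states is hypothetical and is not that of any model in this study.

```python
import math

def thermal_averages(levels, temperature):
    """Return (<E>, heat capacity) for (energy, degeneracy) pairs, k_B = 1."""
    e_min = min(e for e, _ in levels)
    weights = [g * math.exp(-(e - e_min) / temperature) for e, g in levels]
    z = sum(weights)
    mean_e = sum(w * e for w, (e, _) in zip(weights, levels)) / z
    mean_e2 = sum(w * e * e for w, (e, _) in zip(weights, levels)) / z
    return mean_e, (mean_e2 - mean_e ** 2) / temperature ** 2

def kappa2(levels, t_lo, t_hi, n_grid=4000):
    """van't Hoff to calorimetric enthalpy ratio without baseline subtraction."""
    temps = [t_lo + (t_hi - t_lo) * k / (n_grid - 1) for k in range(n_grid)]
    results = [thermal_averages(levels, t) for t in temps]
    c_max, t_max = max((c, t) for (_, c), t in zip(results, temps))
    # The integral of the heat capacity over the scan equals the change in <E>.
    delta_h_cal = results[-1][0] - results[0][0]
    delta_h_vh = 2.0 * t_max * math.sqrt(c_max)
    return delta_h_vh / delta_h_cal

# Hypothetical strictly two-level system: one ground state at E = -28
# and 1e16 unfolded conformations at E = 0.
two_state = [(-28.0, 1.0), (0.0, 1.0e16)]
print(kappa2(two_state, t_lo=0.2, t_hi=3.0))  # slightly below 1
```

Populating intermediate energy levels broadens the heat capacity peak and lowers the ratio, which is why values close to 1 are taken as a signature of two-state thermodynamics.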

As it stands, the transition midpoints of all 27mers considered here with the local-nonlocal
coupling parametrized by a = 0.1 are very close to one another. This is because
the interaction scheme in Fig. 1B assigns the same energy (= 28ε) to every ground-state
conformation. This is a simplifying assumption in the present modeling setup. Since
the thermodynamic stabilities of real, small single-domain proteins are quite diverse,
it is important to note that, in a broader perspective, our hypothesis that significant
CO-dependent folding can emerge from local-nonlocal coupling is not contingent upon
the different proteins in question having very similar thermodynamic stabilities. In more
sophisticated models, for example, an extra favorable energy that differs from one 27mer
to another may be assigned to the ground-state conformation (i.e., a different E_gs term
as defined in ref. 10 for each 27mer). In that case, the thermodynamic stabilities of different
27mers can be very different, but their folding rates would not be affected by this
extra feature of the model. In other words, the correlation between CO and folding rate
in Fig. 2B would remain unchanged. As we have recently argued, such extra stabilizing
energies for the ground state as a whole are physically plausible because experimental
evidence indicates that in real proteins there is a partial separation between the driving
forces for folding kinetics and the interactions responsible for thermodynamic stability.



DISCUSSION 



Energy landscapes of the present models are further characterized in Fig. 4 for three
representative structures with low, intermediate, and high CO values. In this figure, the
low- and high-CO structures are, respectively, the fastest and slowest folding among the
97 structures in Table I, whereas the intermediate-CO structure is the one analyzed in
Fig. 3. For the common additive Go potential, energy E is directly proportional to Q
(E = εQ). However, for the cooperative models with local-nonlocal coupling, there are
multiple energy levels for each Q, with E = εQ as the lower bound (left panels of Fig. 4).
This means that, on average, the energetic separations between non-ground-state and
ground-state conformations in the cooperative models with local-nonlocal coupling are
larger than those in the additive Go models. This feature is demonstrated directly in the
right panels of Fig. 4, which show that the number of non-ground-state conformations
within a given energy range is smaller for the cooperative models than for the additive
Go models except for the highest energies (E ≈ 0). It follows that the overall thermodynamic
cooperativities of the models with local-nonlocal coupling are substantially
higher than those of the corresponding additive Go models. This behavior is expected
as well from our recent finding that simple two-state folding/unfolding kinetics (Fig. 3
above) requires "near-Levinthal" thermodynamic cooperativity. Indeed, for the three
models in Fig. 4 with local-nonlocal coupling, the van't Hoff to calorimetric enthalpy
ratios ΔH_vH/ΔH_cal are, from top to bottom, κ_2 = 0.972, 0.992, and 0.998. These
values are extremely high for model enthalpy ratios without baseline subtractions. In
contrast, the corresponding additive Go models are less cooperative, with κ_2 = 0.751,
0.861, and 0.878. Here it is noteworthy that the additive Go models' ΔH_vH/ΔH_cal ratios
even after empirical baseline subtractions, 0.885, 0.961, and 0.962, are lower than
the ΔH_vH/ΔH_cal ratios of the cooperative models in the absence of baseline subtractions.

Contact-order dependence indicative of special mechanisms of cooperativity 

Obviously, thermodynamic cooperativity is a necessary ingredient for any protein
chain model that purports to rationalize the generic properties of small single-domain
proteins. For the particular interaction scheme we consider, the above analysis shows
that features that give rise to significant CO-dependent folding also lead to high thermodynamic
cooperativity. However, the converse is not necessarily true. More in-depth
considerations and a comparison of the present results with those of Jewett et al. indicate
that higher thermodynamic cooperativity per se does not necessarily give rise to
more enhanced dependence of folding rate on CO. Our reasoning is as follows. First, for
the present set of 27mer structures we have chosen randomly, the correlation between
logarithmic folding rate and CO is quantified by r = 0.63 (r² = 0.39) for the additive
Go interaction scheme. Although this correlation happens to be weaker than that of
Jewett et al.'s collection of additive Go models (their r² = 0.51), after cooperativity is
enhanced by local-nonlocal coupling, the correlation between logarithmic folding rate and
CO for our a = 0.1 models is much higher (r² = 0.84, see Fig. 2 above, an improvement in
r² value of 0.33)‡ than the best case reported by Jewett et al. (r² = 0.57 for their s = 3,
an improvement in r² value of 0.06 over that for their additive Go models).§ Second, the
folding rates of our cooperative models are much more sensitive to CO, covering 2.5 to 3
orders of magnitude, whereas those of Jewett et al. cover only approximately 1.3 orders
of magnitude. This means that the present local-nonlocal coupling mechanism is significantly
more effective in enhancing CO dependence than the nonlinear E-Q relationship
postulated by Jewett et al. (equation 1 of ref. 27). Their interaction scheme does not
make direct reference to chain conformations as such. Thermodynamic cooperativity is
enhanced in their models by stipulating that the total contact energy E (for a given
conformation as a whole) does not decrease (does not become more favorable) linearly
‡ Because all the model chains in the present study have the same length and the same number of
native contacts, their correlation coefficient between folding rate and CO is the same as that between
folding rate and the total contact distance (TCD) defined in ref. 51.

§ If the s = 3 interaction scheme of Jewett et al. is applied to the present set of structures and kinetic
models, we found r² = 0.65 for the correlation between CO and folding rate. In this case, the folding
rates span ≈ 1.8 orders of magnitude; see ref. 52 for details.
with increasing Q as in common Go models, but rather decreases at progressively faster
and faster rates when Q is closer to unity.¶ Third, in fact, if thermodynamic cooperativity
is further increased in the interaction scheme of Jewett et al. by increasing their
s parameter, the energy landscape will eventually become a Levinthal golf course in the
s → ∞ limit. In that case, folding would be rate-limited by random conformational
search and CO-dependence would be all but eliminated. Fourth, in this connection, we
have recently considered three 27mer models with CO = 0.28, 0.40 and 0.51 in a separate
study. The thermodynamic cooperativity of these models is enhanced by assigning an
extra stabilizing energy to the ground state but without local-nonlocal coupling. For
the energetic parameters we considered, the folding rates of these models cover less than
an order of magnitude. The same set of results also indicated that dispersion in folding
rates under moderately folding conditions would decrease if thermodynamic cooperativity
is increased by assigning an even stronger stabilizing energy to the ground state, in
a manner similar to greatly increasing s in Jewett et al.'s formulation. Taken together,
these observations lead us to the conclusion that while thermodynamic cooperativity is
certainly necessary, by itself it is not sufficient to guarantee CO-dependent folding rates
similar to those observed experimentally if the underlying mechanism for thermodynamic
cooperativity is not specified.

CO-dependent folding highlights the important role of local interactions in determining
folding rates. It suggests that the mechanism of folding may involve relatively
fast formation of local structure. In this regard, we note that under the general lattice
scheme in Fig. 1B, formation of strong (unattenuated) native contacts with contact order
|j − i| = 3 is relatively easier than formation of strong native contacts with higher contact
orders. This is because in the |j − i| = 3 case there is an overlap between parts of the two
local segments that have to be nativelike in order for the contact to be strong. Physically,
how a general mechanism similar to that in Fig. 1 may arise in real proteins from solvent-mediated
atomic interactions such as sidechain packing and hydrogen bonding remains
to be elucidated. Many basic issues will have to be tackled to address this question.
For example, correlations between backbone and sidechain rotamer conformations may
contribute to such a mechanism. Another possibility is that aspects of anti-cooperativity
¶ Jewett et al. suggested that the "extraordinary cooperativity in protein folding" may originate
from "three-body interactions." But how three-body interactions might lead to their E-Q relationship
remains to be elucidated.
of certain types of hydrophobic interactions may help disfavor premature nonspecific
hydrophobic collapse (which would lead to kinetic trapping) when the sidechains are
locally less well packed than in the native state. If this is the case, it could give rise
to local-nonlocal coupling mechanisms similar to that postulated in Fig. 1.

In summary, while the models used in the present study are rudimentary, they provide 
strong evidence that a cooperative interplay between local conformational preferences 
and nonlocal favorable contact-like interactions is an important mechanism in account- 
ing for experimentally observed CO-dependent folding of small single-domain proteins. 
We are optimistic that more rigorous applications of the CO-dependence constraint as 
well as the thermodynamic and kinetic cooperativity requirements would help further 
narrow down theoretical possibilities and thus contribute to a more realistic understand- 
ing of protein energetics. 
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Table I 





conformation 




conformation 


208 


uufddfuurddbuubddruufdclfuu 


306 


uufrrbldrfdfiuruUddburdbr 


210 


uufddfuurbbdfFdbbrffuubdbu 


308 


uufdfrbrbulddrfllfrruublfl 


212 


ufdfuubbrddfFubufrddbuubdd 


310 


ufrulblfddrrbbllfuburdrfub 


214 


uuffdbdfrbufubbddrffuubdbu 


312 


uufrbbllffdrrdllbubdrurfdb 


216 


ufdfuubbrddfuufddruubbddfu 


314 


uufdrubbdfdfllbbuffubbrddr 


218 


ufdfuubbrdfufddbbruuffddbu 
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ufTrddblbruufdllbdifrulubb 
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uuffddburfdbbuuffrddbuubdd 
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uufrbddbuullffdrrdllbubdru 
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uuffddburdfuubbddruuffdbdf 
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uufddfrruubbdfdbluuffldrbd 
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uufddrbufubrfdbdffluriilldd 
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uffdbrbrfufullbbrdrufldfdr 
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uufddfuurddbubdrffubbulfrf 
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ufrbddlfrfluruUbbddffubrr 
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ufrdbrbuffdrbbuffubbllfrfl 
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ufrfddluulddbbuufdrdbrfubu 
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ufdrbufublffddruurddbuubdd 
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ufrubdbuldldrrfHlbufurblb 
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uufrrblddrufdluldfurdruull 


330 


uufdrubblddlfubuffddrrbbuf 
uufddfuurddrbluurfdbbulddr
uuffrddruubbdfllfdbrrbluuf
ufddbbuurrflfrdlbdfrbubldr
ufrrdbdfluldbbrruuflbldrfd
uufFdbdfrrblbrulfFrulbbrfd
uffrbrbuflblffrrddllbrrblu
ufFdbrfurdbblufrbufHlbbrf
uffurrdldrbbluruUfrrdldlf
ufrrbbdffdlbrbllfFurbubldr
ufruUbrrblldrrdllfufdrrbu
248


ufdfrbuflurblbrrdldrffuubd 
uuffddrrbbuufdfuldbubddflu
ufddbrfruublfdbrdblluurdru
uufrbbdlulddrrffllbuufdrrb
ufddbrbhirrdfilTibrfiilbrbll
ufdfurdnmllbbrddrflinirdbu
ufrfdrbufubbllfrflddbrburd
ufrbdflfrrbbuuUfrrdfuUdr
uffrrbdbuuUffrrbldbdflfrr
uufdrfdruubbddluufflddbrru
ufdrbdlfrrubdblluurffrbbdl


374 


uffrdrbbuullffrrdbuldbdflf 






278 


ufddrrbllbrrullurrfHbdfrb 
uffrddllbuubddrfrbuufdlflu
uffrddlubdruubddllfubuffdd
ufdrfdlluubbdfdbrfrbuuffld
ufdfniUddbuubddrffrbuubdd
ufrfddllubdrrblluuffrdbrbu
ufdfurddlluubbdfdbrfrbuufd
ufrufrbbddffllbrbuulffdrrb
ufrdlluurrbbdfdbllfuubdruf







Table I. The ground-state 27mer conformations (n = 27) used in this investigation
are given by sequences of 26 bond directions, where r = right (+x), l = left (−x), f =
forward (+y), b = backward (−y), u = up (+z), d = down (−z). A structure is randomly
selected for each of the 97 possible CO values amongst the compact 27mer structures
with t_max = 28 contacts. Each integer is the sum of |j − i| over the (i, j) nearest-
neighbor contacts in the given conformation (j − i ≥ 3). Here CO = Σ ΔS_ij / (n t_max),
with ΔS_ij = |j − i|.
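
The bookkeeping in the Table I caption can be checked with a short script. The following is a minimal sketch, not the authors' code: the function names (`decode`, `contacts`, `contact_order`) are hypothetical, and the 208 entry of Table I is read here as `uufddfuurddbuubddruufddfuu`. It decodes a bond-direction string into cubic-lattice coordinates, lists the nonbonded nearest-neighbor contacts, and evaluates ΣΔS_ij and CO = ΣΔS_ij / (n t_max).

```python
# Bond-direction alphabet from the Table I caption (unit steps on a cubic lattice).
STEPS = {
    "r": (1, 0, 0), "l": (-1, 0, 0),
    "f": (0, 1, 0), "b": (0, -1, 0),
    "u": (0, 0, 1), "d": (0, 0, -1),
}

def decode(bonds):
    """Turn a bond-direction string into a list of lattice coordinates."""
    coords = [(0, 0, 0)]
    for step in bonds:
        dx, dy, dz = STEPS[step]
        x, y, z = coords[-1]
        coords.append((x + dx, y + dy, z + dz))
    if len(set(coords)) != len(coords):
        raise ValueError("chain is not self-avoiding")
    return coords

def contacts(coords):
    """Nonbonded nearest-neighbor pairs (i, j), 0-based, with j - i >= 3."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 3, len(coords)):
            # Lattice neighbors are exactly one unit step apart.
            if sum(abs(a - b) for a, b in zip(coords[i], coords[j])) == 1:
                pairs.append((i, j))
    return pairs

def contact_order(bonds, t_max=28):
    """Return (sum of |j - i| over contacts, relative contact order CO)."""
    coords = decode(bonds)
    total = sum(j - i for i, j in contacts(coords))
    return total, total / (len(coords) * t_max)

total, co = contact_order("uufddfuurddbuubddruufddfuu")
print(total, round(co, 3))  # 208 0.275
```

For this string the chain fills a 3 × 3 × 3 cube, so all 28 possible contacts are present and the sum reproduces the tabulated integer.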
Figure Captions



Figure 1. (A) Schematics of local-nonlocal cooperative energetics in protein folding.
The conformation in the solid box represents the native (N) structure; the two filled
circles depict a pair of nonlocal residues interacting favorably in the native state. The
interaction strength between a residue pair is strong and essentially the same as that
in the native structure if the chain segments sequentially local to both residues are
nativelike, as in (i). [Dotted boxes in (A) are used to mark nativelike chain segments.]
However, the interaction strength is weakened if one or two chain segments sequentially
local to the interacting residues are not nativelike, as in examples (ii)-(iv). (B) A lattice
implementation of this protein folding scenario. Here the favorable energy for every
contact (between residues i and j, |j − i| ≥ 3) in the ground-state native (N) structure is ε
(< 0) when the relative positions of the five residues centered at i (residues i − 2, i − 1,
i, i + 1, and i + 2) as well as the relative positions of the five residues centered at j (residues
j − 2, j − 1, j, j + 1, and j + 2) are the same as those in N [solid lines in (i)], irrespective
of the relative orientations of the two five-residue chain segments. However, if the
local conformation of one or both sets of five contiguous residues is nonnative, the contact
energy is weakened by an attenuation factor a (0 < a < 1). Examples of the latter
situation are given by (ii)-(iv), where nonnative local chain segments are drawn as broken lines.
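
The lattice rule in (B) can be sketched in code. This is an illustration under stated assumptions, not the authors' implementation: the contact energy is written ε (`eps`), nonnative contacts are assumed to contribute zero (Go-like), the five-residue window is simply truncated near the chain ends, and "same relative positions irrespective of orientation" is tested with rotation-invariant dot products and triple products of the local bond vectors. All function names are hypothetical.

```python
from itertools import combinations

STEPS = {"r": (1, 0, 0), "l": (-1, 0, 0), "f": (0, 1, 0),
         "b": (0, -1, 0), "u": (0, 0, 1), "d": (0, 0, -1)}

def decode(bonds):
    """Bond-direction string -> list of cubic-lattice coordinates."""
    coords = [(0, 0, 0)]
    for s in bonds:
        coords.append(tuple(c + d for c, d in zip(coords[-1], STEPS[s])))
    return coords

def det3(u, v, w):
    """Triple product u . (v x w); unchanged by proper rotations."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def local_signature(coords, i):
    """Rotation-invariant shape of the (up to) five residues centered at i."""
    lo, hi = max(0, i - 2), min(len(coords) - 1, i + 2)  # assumption: truncate at ends
    vecs = [tuple(b - a for a, b in zip(coords[k], coords[k + 1]))
            for k in range(lo, hi)]
    dots = tuple(sum(a * b for a, b in zip(u, v)) for u, v in combinations(vecs, 2))
    dets = tuple(det3(u, v, w) for u, v, w in combinations(vecs, 3))
    return dots, dets

def is_contact(coords, i, j):
    return sum(abs(a - b) for a, b in zip(coords[i], coords[j])) == 1

def cooperative_energy(coords, native, eps=-1.0, a=0.1):
    """Sum eps over native contacts with nativelike local segments, a*eps otherwise."""
    energy = 0.0
    n = len(native)
    for i in range(n):
        for j in range(i + 3, n):
            if is_contact(native, i, j) and is_contact(coords, i, j):
                nativelike = all(local_signature(coords, k) == local_signature(native, k)
                                 for k in (i, j))
                energy += eps if nativelike else a * eps
    return energy

native = decode("uufddfuurddbuubddruufddfuu")
print(cooperative_energy(native, native))  # -28.0
```

Setting `a=1` removes the dependence on local conformation, so the same function then returns the additive Go energy.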

Figure 2. Correlation between the common (base 10) logarithm of folding rate and
CO for the 97 structures in Table I under moderately folding conditions at ε/k_BT =
−1.47, using (A) the common additive Go potential and (B) the local-nonlocal cooperative
interaction scheme with a = 0.1. Solid lines are least-squares fits. Here folding rate is
the reciprocal of mean folding first passage time (folding rate = 1/MFPT). Each MFPT
is averaged from 500 trajectories. Associated with each value of log_10 (1/MFPT) (filled
circle) is an open circle marking the common logarithm of the median folding first
passage time (FPT) divided by ln 2. If the kinetics is single-exponential, MFPT = (median
FPT)/ln 2. The inset in (A) is the distribution of CO among the 103,346 maximally
compact 27mer conformations, wherein the number of conformations (vertical scale) is
shown as a function of CO (horizontal scale). The inset in (B) uses six representative
structures with different CO values (ΣΔS_ij = 208, 224, 268, 310, 348, and 386 entries
in Table I) to illustrate that log_10 (folding rate) (vertical scale) is more sensitive to CO
(horizontal scale) when the local-nonlocal coupling is stronger. In this inset, different
symbols denote different a values; the lines fitted through the symbols are, from top to
bottom, for a = 1, 0.75, 0.5, 0.25, 0.1, and 0.0.
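
The relation MFPT = (median FPT)/ln 2 for single-exponential kinetics can be illustrated numerically. The sketch below uses synthetic, exponentially distributed first passage times rather than simulation data; the time constant `tau`, the sample size, and the function name are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def folding_rate_estimates(fpts):
    """Return (1/MFPT, ln 2 / median FPT) for a list of first passage times."""
    mfpt = statistics.fmean(fpts)
    median_fpt = statistics.median(fpts)
    return 1.0 / mfpt, math.log(2) / median_fpt

# Synthetic single-exponential FPTs with mean tau (illustrative value only).
rng = random.Random(0)
tau = 1.0e5
fpts = [rng.expovariate(1.0 / tau) for _ in range(200_000)]

rate_mean, rate_median = folding_rate_estimates(fpts)
print(math.log10(rate_mean), math.log10(rate_median))
```

For exponentially distributed FPTs the two estimates agree to within sampling error, so a visible gap between a filled and an open circle signals a departure from single-exponential kinetics.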

Figure 3. Model chevron plots for a CO = 0.410 structure (ΣΔS_ij = 310 entry
in Table I) are given by the negative natural logarithm of MFPT as a function of ε/k_BT
(filled symbols). Values of (median FPT)/ln 2 are shown by the open symbols. Squares
(folding) and triangles (unfolding) are for the additive Go potential (a = 1, upper plot),
whereas circles (folding) and diamonds (unfolding) are for the a = 0.1 local-nonlocal
cooperative interaction scheme (lower plot). Each MFPT is averaged from 500 trajectories,
except for the model with local-nonlocal coupling at ε/k_BT = −1.47 (arrow). For this
particular case, 7,500 folding trajectories were simulated to provide enriched statistics
for the FPT distribution in the inset, wherein P(t) Δt is the fraction of trajectories with
t − Δt/2 < FPT < t + Δt/2, and the bin size Δt for FPT is equal to 5 × 10^ . The free
energy of unfolding ΔG_u for the a = 0.1 cooperative model is computed using Monte Carlo
histogram techniques based on sampling at the transition midpoint ε/k_BT = −1.33.
ΔG_u is essentially linear in ε (lower horizontal scale). The dotted V-shape, which fits
well to the kinetic data points of the a = 0.1 cooperative model over an extended regime,
is a hypothetical simple two-state chevron plot consistent with the dependence of ΔG_u
on ε.
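
For reference, the textbook two-state condition linking the two arms of a chevron plot to the unfolding free energy can be written as follows. This is a hedged reading of what "consistent with the dependence of ΔG_u on ε" means; the symbols k_f and k_u for the folding and unfolding rates are introduced here and are not defined in the caption.

```latex
% Simple two-state consistency between kinetics and thermodynamics.
% k_f and k_u are taken as the reciprocals of the folding and unfolding MFPTs.
\ln k_{\mathrm{f}}(\epsilon) - \ln k_{\mathrm{u}}(\epsilon)
  = \frac{\Delta G_{\mathrm{u}}(\epsilon)}{k_{\mathrm{B}}T},
\qquad
k_{\mathrm{f}} = \frac{1}{\mathrm{MFPT}_{\mathrm{folding}}},
\quad
k_{\mathrm{u}} = \frac{1}{\mathrm{MFPT}_{\mathrm{unfolding}}}.
```

Under this condition the two arms cross where ΔG_u = 0, and if ΔG_u is linear in ε then two straight arms whose vertical separation equals ΔG_u/k_BT form a V-shape.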

Figure 4. Energy landscapes of three representative models with local-nonlocal
coupling (a = 0.1, ΣΔS_ij = 224, 310, and 386 entries in Table I; ε = −1). The
left panels show the correlation between E and Q; each dot indicates that at least
one conformation with the given (E, Q) was encountered in our sampling. The right
panels show these structures' logarithmic densities of states, where g(E) is the number
of conformations with energy E for the cooperative models (a = 0.1, dots). Included for
comparison are the ln g(E) values of the corresponding additive Go models (a = 1, open
circles; ε = −1). The densities of states here are estimated by Monte Carlo sampling at
the models' transition midpoints ε/k_BT = −1.33 (a = 0.1) and ε/k_BT = −1.43 (a = 1).
Note that the cooperative models have more energy levels than the additive models.
Therefore, to compare their densities of states on an equal footing, the open squares
provide the natural logarithm of the number of conformations in the a = 0.1 cooperative
models with energies in the range m − 0.5 < E < m + 0.5, where m = 1, 0, −1, −2, . . .
is an integer. Now the densities of states represented by the open squares (a = 0.1) are
directly comparable to those represented by the open circles (a = 1) because their values
are based upon the same unity bin size for E.
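
The unit-bin regrouping described for the open squares can be sketched as below. The function name and the input format (a mapping from energy to number of conformations) are hypothetical. Energies that fall exactly on a bin edge are assigned to the lower bin, i.e. m − 0.5 < E ≤ m + 0.5; this edge convention is an assumption made here, since the caption gives strict inequalities on both sides.

```python
import math
from collections import defaultdict

def unit_bin_log_density(g, ndigits=6):
    """Rebin {energy: count} into unit-width bins centered on integers m.

    Returns {m: ln(number of conformations with m - 0.5 < E <= m + 0.5)}.
    Energies are rounded to `ndigits` decimals first to absorb float noise.
    """
    binned = defaultdict(int)
    for energy, count in g.items():
        m = math.ceil(round(energy - 0.5, ndigits))
        binned[m] += count
    return {m: math.log(c) for m, c in sorted(binned.items())}

# Toy counts (not simulation data): three levels land in the m = -1 bin.
toy = {-1.0: 2, -1.2: 3, -0.5: 1, -0.4: 4}
print(unit_bin_log_density(toy))
```

With the toy input, the levels at −1.2, −1.0, and −0.5 pool into the m = −1 bin (6 conformations) and the level at −0.4 goes to m = 0 (4 conformations).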